What then determines whether a protein will adopt a globin fold? The
main-chain conformation appears to be specified by the pattern of nonpolar and
polar residues. Thirty-two positions in the sequence are nearly always
hydrophobic; all these side chains are inside the protein. Surprisingly large
changes in the volume of these residues can be tolerated. Helices can shift
their position by several angstroms and their orientation by as much as 30
degrees to compensate for changes in the volume of the hydrophobic
core. Another 32 positions are nearly always occupied by a charged or
polar residue, or by Gly or Ala. These hydrophilic residues are located on
the surface of the protein. The distinctive hydrophilic versus hydrophobic
pattern of these conserved residues, which comprise nearly half the total,
distinguishes globins from all other proteins.
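
The idea of positions that are "nearly always" hydrophobic or "nearly always" hydrophilic can be made concrete with a small sketch. It assumes a set of pre-aligned sequences and labels each column by the class of residue found there. The residue sets, the 0.9 threshold, the function name, and the toy sequences are illustrative assumptions, not values taken from the globin analysis; only the grouping of Gly and Ala with the surface residues follows the text.

```python
# Assumed residue classes. The text groups charged, polar, Gly, and Ala
# together as surface residues; the exact membership below is an assumption.
HYDROPHOBIC = set("VLIMFWYC")
SURFACE = set("DEKRHNQSTGA")


def classify_columns(aligned, threshold=0.9):
    """Label each alignment column as 'h' (nearly always hydrophobic),
    'p' (nearly always surface-type), or '.' (no consistent pattern)."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    labels = []
    for column in zip(*aligned):
        residues = [r for r in column if r != "-"]  # ignore gap characters
        if not residues:
            labels.append(".")
            continue
        frac_h = sum(r in HYDROPHOBIC for r in residues) / len(residues)
        frac_p = sum(r in SURFACE for r in residues) / len(residues)
        if frac_h >= threshold:
            labels.append("h")
        elif frac_p >= threshold:
            labels.append("p")
        else:
            labels.append(".")
    return "".join(labels)


# Hypothetical toy alignment: no column is identical across all three
# sequences, yet the hydrophobic/hydrophilic pattern is fully conserved.
toy = ["LKDVAEL", "IRELSKV", "VEDFGQM"]
print(classify_columns(toy))  # hpphpph
```

A real analysis would start from a much larger alignment and a carefully chosen hydrophobicity scale; the point here is only that the conserved feature is the class of residue at a position, not its identity.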

Mutagenesis studies of λ repressor, a phage protein that controls gene
expression (p. 959), were carried out to learn the range of allowable
sequence changes in a functionally important region. A helix of one
monomer packs against the corresponding helix of another monomer to
form a dimer that binds to specific sites on DNA and silences gene
expression (Figure 16-27A). Two or three residues at a time were simultaneously
mutated by using mixtures of nucleotides to synthesize codons for all
twenty amino acids at these positions. Functional proteins were then
selected and sequenced. The important finding was that most positions can be
changed with retention of function (Figure 16-27B). In particular,
solvent-accessible residues can readily be substituted, whereas buried ones are
much more conserved. For example, the activity of the repressor was
unaffected by the replacement of an external glutamate with any of 12
other amino acids.
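
How a mixture of nucleotides can yield codons for all twenty amino acids can be checked with a short sketch. The text does not say which mixtures were used; the scheme below (all four bases at the first two codon positions, G or C at the third) is an assumption chosen for illustration, and the function name is hypothetical. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code in compact form: codons are enumerated with the
# third base varying fastest, each position running through T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def encoded_residues(mix1, mix2, mix3):
    """Return the set of residues ('*' marks a stop codon) encoded when each
    codon position is drawn independently from the given base mixture."""
    return {CODON_TABLE[a + b + c] for a, b, c in product(mix1, mix2, mix3)}


# Assumed mixture: any base at positions 1 and 2, G or C at position 3.
residues = encoded_residues("ACGT", "ACGT", "GC")
amino_acids = residues - {"*"}
print(len(amino_acids), "amino acids; stop codon included:", "*" in residues)
```

Under this assumed scheme, 32 codons cover all twenty amino acids while admitting only one of the three stop codons (TAG), which is why a restricted third position is a convenient choice for randomizing a site.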
Figure 16-27

Structure and mutagenesis of the
repressor protein of λ phage.
(A) λ repressor binds as a dimer (yellow
and blue) to DNA (green and
red). Helix 5 (shown in darker color)
of one subunit interacts with helix 5
of the other subunit. (B) Functionally
acceptable residues of a key part of
helix 5, as determined by mutagenesis.
[(A) Drawn from 1lmb.pdb.
L.J. Beamer and C.O. Pabo. J. Mol.
Biol. 227(1992):177. (B) After
J.F. Reidhaar-Olson and R.T. Sauer.
Science 241(1988):53.]
AN ENCOURAGING START HAS BEEN MADE IN PREDICTING
THE THREE-DIMENSIONAL STRUCTURE OF PROTEINS 



One of the goals of protein chemistry is to be able to predict the
three-dimensional structure of a protein, given its amino acid sequence.
Prediction is difficult because (1) a polypeptide chain has a vast number of
potential conformations and (2) proteins are only marginally stable,
which means that the free-energy difference between the unfolded and
folded states is small. No simple code relates the one-dimensional information of
a sequence to the rich three-dimensional form of the folded protein. The folding
problem is inherently complex.
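
A rough sense of what "a vast number of potential conformations" means can be had from a back-of-the-envelope calculation. Every number in the sketch below (chain length, conformations per residue, sampling rate) is an illustrative assumption rather than a value given in the text.

```python
# All three numbers below are illustrative assumptions, not values from the text.
RESIDUES = 100              # length of a hypothetical small protein
STATES_PER_RESIDUE = 3      # assumed backbone conformations per residue
SAMPLES_PER_SECOND = 1e13   # assumed rate of conformational sampling

SECONDS_PER_YEAR = 365.25 * 24 * 3600

# If each residue chooses its conformation independently, the counts multiply.
conformations = STATES_PER_RESIDUE ** RESIDUES
years_to_enumerate = conformations / SAMPLES_PER_SECOND / SECONDS_PER_YEAR

print(f"conformations: {conformations:.2e}")
print(f"years to enumerate them all: {years_to_enumerate:.2e}")
```

Under these assumptions the number of conformations is on the order of 10^47, so neither a protein nor a prediction program can find the folded state by exhaustive search.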



Significant progress is now being made in this challenging area of
inquiry for several reasons. First, the structures of an increasingly large
number of proteins are being solved by x-ray crystallography and NMR
spectroscopy. More than 100 different protein folds are now known.
Second, DNA sequencing is providing a wealth of amino acid sequence
information. By comparing the sequences of homologous proteins, we can
learn far more about the rules governing structure than from a single
sequence. Third, sophisticated computer programs are taking advantage
of the rapidly enlarging sequence and structure databases. Subtle
patterns and relationships can be deduced and displayed. Fourth,
predictions of structure can be rapidly tested. Theory and experiment have
come together and are mutually reinforcing.

The first step in structure prediction is to ask whether the sequence of a
new protein is similar to one whose three-dimensional structure is already
known. If two proteins are more than 40% identical in sequence, their backbone
conformations are very likely to be nearly the same in the region of partial
identity. A pair of proteins differing markedly in sequence can have essentially
the same backbone structure if their hydrophobicity patterns are alike. The
structural similarity of root nodule leghemoglobin and human hemoglobin is
a striking illustration (see Figure 16-26). Functional motifs in proteins can
also be identified by scanning amino acid sequences. Calcium-binding EF hands
(p. 348), for example, can be found by searching for a distinctive pattern
of hydrophobic and oxygen-containing side chains in successive 29-residue
segments of an amino acid sequence (Figure 16-28). X-ray crystallography
studies have shown that recoverin, a calcium sensor in vision,
indeed contains EF hands (Figure 16-29), as was predicted from its amino
acid sequence.
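
The two comparisons described above, percent sequence identity and agreement of hydrophobicity patterns, can be sketched for a pair of sequences that have already been aligned. The function names, the hydrophobic residue set, and the toy sequences are assumptions made for illustration; the alignment step itself is not shown.

```python
# Assumed set of hydrophobic residues; other scales would draw the line differently.
HYDROPHOBIC = set("VLIMFWYC")


def _aligned_pairs(a, b):
    """Pair up residues of two pre-aligned sequences, skipping gap columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]


def percent_identity(a, b):
    """Percentage of aligned, non-gap positions holding the same residue."""
    pairs = _aligned_pairs(a, b)
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def pattern_agreement(a, b):
    """Percentage of aligned, non-gap positions where both residues fall in
    the same class (both hydrophobic or both not)."""
    pairs = _aligned_pairs(a, b)
    if not pairs:
        return 0.0
    same = sum((x in HYDROPHOBIC) == (y in HYDROPHOBIC) for x, y in pairs)
    return 100.0 * same / len(pairs)


# Hypothetical toy sequences that share no residue yet share a pattern.
a, b = "LKDVAEL", "IRELSKV"
print(percent_identity(a, b), pattern_agreement(a, b))  # 0.0 100.0
```

The toy pair is deliberately extreme: it shows how two sequences could fall far below the 40% identity criterion and still present the same arrangement of hydrophobic and hydrophilic positions.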
Figure 16-29

The EF hands of calmodulin (blue)
and recoverin (red) are very similar
in structure. Bound Ca2+ is shown as
a sphere. The mean difference in the
position of main-chain atoms is 0.8 Å.
[Drawn from 3cln.pdb. Y.S. Babu,
C.E. Bugg, and W.J. Cook. J. Mol. Biol.
204(1988):191; and 1rec.pdb.
K.M. Flaherty, S. Zozulya, L. Stryer,
and D.B. McKay. Cell 75(1993):709.]
Figure 16-28

Consensus sequence for calcium-binding EF hands in proteins. EF hands can be
detected by scanning amino acid sequences for this 29-residue motif. The symbol
n (yellow) denotes a nonpolar side chain, and O (red) denotes an
oxygen-containing side chain (Asp, Asn, Glu, Gln, Ser, Thr). The residues marked in red
form the calcium-binding loop.
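
Scanning a sequence for a motif written in these symbols can be sketched as follows. The oxygen-containing class is taken from the legend above; the nonpolar residue set, the "." wildcard, the function names, and above all the short toy pattern are assumptions for illustration. The toy pattern is not the 29-residue EF-hand consensus, which is not reproduced here.

```python
import re

OXYGEN = "DNEQST"      # Asp, Asn, Glu, Gln, Ser, Thr (from the figure legend)
NONPOLAR = "AVLIMFWC"  # assumed membership of the nonpolar class

# Map each pattern symbol to a regular-expression fragment.
# "." (any residue) is an added convenience, not a symbol from the figure.
SYMBOLS = {"n": f"[{NONPOLAR}]", "O": f"[{OXYGEN}]", ".": "."}


def compile_motif(pattern):
    """Translate a string of motif symbols into a compiled regex. The
    lookahead wrapper lets overlapping occurrences be reported."""
    body = "".join(SYMBOLS[symbol] for symbol in pattern)
    return re.compile(f"(?=({body}))")


def scan(sequence, pattern):
    """Return the 0-based start position of every match, overlaps included."""
    return [m.start() for m in compile_motif(pattern).finditer(sequence)]


# Hypothetical 7-symbol toy pattern and toy sequence.
print(scan("GGLKKVDGSAA", "n..nO.O"))  # [2]
```

Sliding a fixed-length pattern along the sequence one residue at a time mirrors the search over successive segments described in the text; a practical scanner would also tolerate a limited number of mismatches.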



Investigators are now tackling the most demanding problem, the
prediction of structure in the absence of prior three-dimensional
information about a related protein. By aligning many homologous sequences
from different species, one can discern the pattern of essential hydrophobic
and hydrophilic residues. This pattern is the starting point in identifying
folding units. The accurate prediction of much of the backbone structure
of the catalytic domain of protein kinase A before its crystal structure
was solved is indicative of the rapid progress that is being made.

PROTEIN DESIGN TESTS OUR GRASP OF BASIC PRINCIPLES 
AND CREATES USEFUL NEW MOLECULES 

The design and synthesis of novel proteins and of variations on nature's 
themes are important for several reasons: 

1. De novo synthesis tests our understanding of fundamental principles.
The ultimate criterion of the validity of a theory is its predictive



